This note presents an elementary derivation of the covariances of the $c(c-1)/2$ two-sample rank 
sum statistics computed among all pairs of samples from $c$ populations. 
Mann-Whitney or Wilcoxon rank sum statistics, computed for some or all of the $c(c-1)/2$ pairs 
of samples from $c$ populations, have been used in testing the null hypothesis of homogeneity of 
distribution against a variety of alternatives [1, 3, 4, 5]. This note presents an elementary derivation of 
the covariances of such statistics under the null hypothesis. 

The usual approach to such an analysis is the rank sum viewpoint of the Wilcoxon form of the 
statistic. Using this approach, Steel [3] presents a lengthy derivation of the covariances. In this note 
it is shown that thinking in terms of the Mann-Whitney form of the statistic leads to an elementary 
derivation. For comparison and completeness the rank sum derivation of Kruskal and Wallis [2] is 
repeated in obtaining the means and variances. 

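Let $x_r^i$, $r = 1, 2, \ldots, n_i$; $i = 1, 2, \ldots, c$, be the $r$th item in the sample of size $n_i$ from the $i$th of $c$ 
populations. Let $M_{ij}$ be the Mann-Whitney statistic between the $i$th and $j$th samples defined by 

$$M_{ij} = \sum_{s=1}^{n_j} \sum_{r=1}^{n_i} \phi_{rs}^{ij}, \qquad (1)$$

where 

$$\phi_{rs}^{ij} = \begin{cases} 1 & \text{if } x_s^j > x_r^i, \\ 0 & \text{otherwise.} \end{cases}$$

Thus $M_{ij}$ is the number of times items in the $j$th sample exceed items in the $i$th sample. Let $W_{ij}$ be 
the Wilcoxon rank sum statistic defined by 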

$$W_{ij} = \sum_{s=1}^{n_j} R_{ij}(x_s^j), \qquad (2)$$

where $R_{ij}(x_s^j)$ is the rank of $x_s^j$ in the combined $i$th and $j$th samples. Then $M_{ij}$ and $W_{ij}$ are related by 

$$M_{ij} = W_{ij} - n_j(n_j+1)/2. \qquad (3)$$
Obviously $E(M_{ij}) = E(W_{ij}) - n_j(n_j+1)/2$, the variances of $M_{ij}$ and $W_{ij}$ are equal, and the covariances 
among the $M_{ij}$ are equal to the covariances among the $W_{ij}$. 
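Relation (3) can be checked numerically on a small example. The following is a minimal sketch, assuming no ties in the data; the function names `mann_whitney` and `wilcoxon_rank_sum` and the sample values are illustrative and do not come from the note.

```python
from itertools import product


def mann_whitney(xi, xj):
    """M_ij: number of pairs in which an item of sample j exceeds an item of sample i."""
    return sum(1 for a, b in product(xi, xj) if b > a)


def wilcoxon_rank_sum(xi, xj):
    """W_ij: sum of the ranks of sample j within the combined samples i and j."""
    combined = sorted(xi + xj)
    # Rank lookup is valid only when all values are distinct (no ties assumed).
    rank = {value: r for r, value in enumerate(combined, start=1)}
    return sum(rank[value] for value in xj)


# Illustrative data (hypothetical values, all distinct).
xi = [1.2, 3.4, 0.7]
xj = [2.5, 4.1, 0.9, 5.0]

n_j = len(xj)
m_ij = mann_whitney(xi, xj)
w_ij = wilcoxon_rank_sum(xi, xj)

# Relation (3): M_ij = W_ij - n_j (n_j + 1) / 2.
assert m_ij == w_ij - n_j * (n_j + 1) // 2
```

Because the two statistics differ only by a constant that depends on $n_j$, any variance or covariance computed for one form carries over to the other.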

THEOREM: Under the null hypothesis, the Mann-Whitney form of the statistic has the following 
moments: 

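$$E(M_{ij}) = n_i n_j / 2$$

$$V(M_{ij}) = n_i n_j (n_i + n_j + 1)/12$$

$$C(M_{ij}, M_{ik}) = C(M_{ji}, M_{ki}) = n_i n_j n_k / 12, \qquad i, j, k \text{ all different}$$

$$C(M_{ij}, M_{ki}) = -n_i n_j n_k / 12, \qquad i, j, k \text{ all different}$$

$$C(M_{ij}, M_{kl}) = 0, \qquad i, j, k, l \text{ all different.}$$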
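For small samples the stated moments can be confirmed by brute force. The sketch below assumes that, under the null hypothesis with no ties, every assignment of the ranks $1, \ldots, N$ to the three samples is equally likely; it enumerates all of them and computes exact moments with rational arithmetic. The helper names (`null_moments`, `mean`, `cov`) and the sample sizes are illustrative choices, and the case of four distinct samples is not covered.

```python
from fractions import Fraction
from itertools import permutations


def mann_whitney(xa, xb):
    """M_ab: number of pairs in which an item of sample b exceeds an item of sample a."""
    return sum(1 for a in xa for b in xb if b > a)


def mean(values):
    return Fraction(sum(values), len(values))


def cov(u, v):
    """Exact population covariance over the enumerated outcomes."""
    return mean([a * b for a, b in zip(u, v)]) - mean(u) * mean(v)


def null_moments(n_i, n_j, n_k):
    """Enumerate every assignment of ranks 1..N to samples i, j, k."""
    n_total = n_i + n_j + n_k
    m_ij, m_ik, m_ji, m_ki = [], [], [], []
    for perm in permutations(range(1, n_total + 1)):
        xi = perm[:n_i]
        xj = perm[n_i:n_i + n_j]
        xk = perm[n_i + n_j:]
        m_ij.append(mann_whitney(xi, xj))
        m_ik.append(mann_whitney(xi, xk))
        m_ji.append(mann_whitney(xj, xi))
        m_ki.append(mann_whitney(xk, xi))
    return {
        "E(M_ij)": mean(m_ij),
        "V(M_ij)": cov(m_ij, m_ij),
        "C(M_ij,M_ik)": cov(m_ij, m_ik),
        "C(M_ji,M_ki)": cov(m_ji, m_ki),
        "C(M_ij,M_ki)": cov(m_ij, m_ki),
    }


# Hypothetical small sample sizes: 7! = 5040 orderings, enumerated exactly.
n_i, n_j, n_k = 2, 3, 2
moments = null_moments(n_i, n_j, n_k)

assert moments["E(M_ij)"] == Fraction(n_i * n_j, 2)
assert moments["V(M_ij)"] == Fraction(n_i * n_j * (n_i + n_j + 1), 12)
assert moments["C(M_ij,M_ik)"] == Fraction(n_i * n_j * n_k, 12)
assert moments["C(M_ji,M_ki)"] == Fraction(n_i * n_j * n_k, 12)
assert moments["C(M_ij,M_ki)"] == -Fraction(n_i * n_j * n_k, 12)
```

Exact fractions are used instead of floating point so that the comparison with the closed-form expressions is an equality test rather than a tolerance check.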

PROOF: Consider the population consisting of one of each of the integers $1, 2, \ldots, N_{ij}$, where $N_{ij} = n_i + n_j$. 
The population mean and variance are $(N_{ij}+1)/2$ and $(N_{ij}^2-1)/12$, respectively. Now let $\bar{R}_{ij}$ be the 
mean of a random sample of size $n_j$ drawn without replacement from the population. 

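Then 

$$E(\bar{R}_{ij}) = (N_{ij}+1)/2, \qquad (4)$$

and 

$$V(\bar{R}_{ij}) = \frac{N_{ij}^2-1}{12 n_j} \left[ \frac{N_{ij}-n_j}{N_{ij}-1} \right] = \frac{(N_{ij}+1)(N_{ij}-n_j)}{12 n_j}, \qquad (5)$$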



where the quantity in brackets is sometimes called the finite sample correction factor. Under the 
null hypothesis, the distribution of $W_{ij}$ is identical to that of $n_j \bar{R}_{ij}$. The mean and variance of $M_{ij}$ 
follow immediately. 
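The step left implicit here can be written out as follows. This is a worked sketch that uses only (3), (4), (5) and $N_{ij} = n_i + n_j$, with $\bar{R}_{ij}$ denoting the sample mean of the $n_j$ ranks; it is not part of the original derivation.

```latex
\begin{align*}
E(W_{ij}) &= n_j E(\bar{R}_{ij}) = n_j (n_i + n_j + 1)/2, \\
E(M_{ij}) &= E(W_{ij}) - n_j (n_j + 1)/2 = n_i n_j / 2, \\
V(M_{ij}) &= V(W_{ij}) = n_j^2 \, V(\bar{R}_{ij})
           = n_j^2 \cdot \frac{(N_{ij} + 1)(N_{ij} - n_j)}{12 n_j}
           = n_i n_j (n_i + n_j + 1)/12.
\end{align*}
```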

To obtain the covariances, let $M_{i(j+k)}$ be the Mann-Whitney statistic for the combined $j$th and 
$k$th samples compared to the $i$th sample. Recalling that $M_{ab}$ is the number of times items in the $b$th 
sample exceed items in the $a$th sample, it is easily seen that $M_{i(j+k)} = M_{ij} + M_{ik}$. Now, since 
$n_{j+k} = n_j + n_k$, 

$$V(M_{i(j+k)}) = V(M_{ij} + M_{ik}) = n_i (n_j + n_k)(n_i + n_j + n_k + 1)/12. \qquad (6)$$



But 

$$V(M_{ij} + M_{ik}) = V(M_{ij}) + V(M_{ik}) + 2C(M_{ij}, M_{ik}). \qquad (7)$$

It follows that 

$$C(M_{ij}, M_{ik}) = n_i n_j n_k / 12. \qquad (8)$$
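The algebra connecting (6) and (7) to (8) is short enough to write out. The following worked sketch substitutes the variance formula of the theorem for $V(M_{ij})$ and $V(M_{ik})$ and solves for the covariance.

```latex
\begin{align*}
2C(M_{ij}, M_{ik})
  &= V(M_{ij} + M_{ik}) - V(M_{ij}) - V(M_{ik}) \\
  &= \frac{n_i}{12}\bigl[(n_j + n_k)(n_i + n_j + n_k + 1)
       - n_j (n_i + n_j + 1) - n_k (n_i + n_k + 1)\bigr] \\
  &= \frac{n_i}{12}\bigl[n_j n_k + n_k n_j\bigr]
   = \frac{n_i n_j n_k}{6}.
\end{align*}
```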

The remaining covariances follow from the identity $M_{ij} = n_i n_j - M_{ji}$. 
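As a worked sketch of this last step, the identity turns each remaining case into the covariance already obtained, since adding a constant leaves a covariance unchanged and negating one argument reverses its sign.

```latex
\begin{align*}
C(M_{ji}, M_{ki}) &= C(n_i n_j - M_{ij},\; n_i n_k - M_{ik})
                   = C(M_{ij}, M_{ik}) = n_i n_j n_k / 12, \\
C(M_{ij}, M_{ki}) &= C(M_{ij},\; n_i n_k - M_{ik})
                   = -C(M_{ij}, M_{ik}) = -n_i n_j n_k / 12.
\end{align*}
```

When $i$, $j$, $k$, $l$ are all different, $M_{ij}$ and $M_{kl}$ are computed from disjoint samples. Assuming the $c$ samples are drawn independently, the two statistics are independent and their covariance is zero.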